    A Randomized Kernel-Based Secret Image Sharing Scheme

    This paper proposes a (k, n)-threshold secret image sharing scheme that offers the flexibility to balance competing demands, such as information security and storage efficiency, with the help of a randomized kernel (binary matrix) operation. A secret image is split into n shares such that any k or more shares (k ≤ n) can be used to reconstruct the image. Each share is at most the size of the secret image. Security and share sizes are determined solely by the kernel of the scheme. The kernel operation is optimized for both security and computational cost. The storage overhead of the kernel can further be made independent of its size by storing it efficiently as a sparse matrix. Moreover, the scheme is free from any single point of failure (SPOF). Comment: Accepted at the IEEE International Workshop on Information Forensics and Security (WIFS) 201
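
    The paper's randomized-kernel construction is not reproduced here, but the (k, n)-threshold property it provides is the same one classic polynomial secret sharing offers. Below is a minimal sketch, assuming Shamir's scheme over GF(257) applied to a single pixel value; the prime, the function names, and the per-pixel framing are all illustrative, not the paper's method:

```python
import random

P = 257  # smallest prime above 255, so any 8-bit pixel value fits in the field

def make_shares(pixel, k, n):
    """Split one pixel into n shares; any k of them suffice to recover it."""
    coeffs = [pixel] + [random.randrange(P) for _ in range(k - 1)]
    # Share i is the degree-(k-1) polynomial evaluated at x = i, for i = 1..n.
    return [(x, sum(c * pow(x, j, P) for j, c in enumerate(coeffs)) % P)
            for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 recovers the pixel from any k shares."""
    secret = 0
    for xi, yi in shares:
        num = den = 1
        for xj, _ in shares:
            if xj != xi:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, P - 2, P)) % P
    return secret

shares = make_shares(173, k=3, n=5)
assert reconstruct(random.sample(shares, 3)) == 173  # any 3 of 5 shares work
```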

    Deep Bayesian Quadrature Policy Optimization

    We study the problem of obtaining accurate policy gradient estimates from a finite number of samples. Monte-Carlo methods have been the default choice for policy gradient estimation, despite suffering from high variance in the gradient estimates. On the other hand, more sample-efficient alternatives like Bayesian quadrature have received little attention due to their high computational complexity. In this work, we propose deep Bayesian quadrature policy gradient (DBQPG), a computationally efficient, high-dimensional generalization of Bayesian quadrature for policy gradient estimation. We show that DBQPG can substitute for Monte-Carlo estimation in policy gradient methods, and we demonstrate its effectiveness on a set of continuous control benchmarks. Compared to Monte-Carlo estimation, DBQPG provides (i) more accurate gradient estimates with significantly lower variance, (ii) a consistent improvement in sample complexity and average return for several deep policy gradient algorithms, and (iii) a quantification of uncertainty in the gradient estimate that can be incorporated to further improve performance. Comment: Conference paper: AAAI-21. Code available at https://github.com/Akella17/Deep-Bayesian-Quadrature-Policy-Optimizatio
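
    As a hedged illustration of the quadrature idea DBQPG scales up (not the paper's method, which handles high-dimensional policy-gradient integrands with structured kernel machinery), here is a 1-D sketch: fit a Gaussian process to sampled function values and integrate its posterior mean in closed form under a Gaussian measure. The kernel choice, hyperparameters, and names are assumptions:

```python
import numpy as np

def bq_estimate(xs, ys, mu, s, ell=1.0, jitter=1e-8):
    """GP-based estimate of E_{x ~ N(mu, s^2)}[f(x)] from samples (xs, f(xs))."""
    # RBF Gram matrix of the sample locations.
    K = np.exp(-0.5 * (xs[:, None] - xs[None, :]) ** 2 / ell ** 2)
    # Closed-form kernel mean embedding of N(mu, s^2) under the RBF kernel.
    z = np.sqrt(ell ** 2 / (ell ** 2 + s ** 2)) * np.exp(
        -0.5 * (xs - mu) ** 2 / (ell ** 2 + s ** 2))
    # Quadrature weights z @ K^{-1} replace the uniform Monte-Carlo weights 1/N.
    return z @ np.linalg.solve(K + jitter * np.eye(len(xs)), ys)

rng = np.random.default_rng(0)
f = lambda x: np.sin(3 * x) + x ** 2   # stand-in for a per-sample gradient term
xs = rng.normal(0.0, 1.0, 20)
print("Monte-Carlo estimate:        ", f(xs).mean())
print("Bayesian-quadrature estimate:", bq_estimate(xs, f(xs), mu=0.0, s=1.0))
```

    The weighted-sum form is what makes the approach a drop-in replacement for Monte-Carlo averaging: both estimators reduce to a dot product between sample evaluations and a weight vector.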

    Reasoning with Latent Diffusion in Offline Reinforcement Learning

    Offline reinforcement learning (RL) holds promise as a means to learn high-reward policies from a static dataset, without further environment interaction. However, a key challenge in offline RL lies in effectively stitching together portions of suboptimal trajectories from the static dataset while avoiding the extrapolation errors that arise from a lack of support in the dataset. Existing approaches use conservative methods that are tricky to tune and struggle with multi-modal data (as we show), or rely on noisy Monte-Carlo return-to-go samples for reward conditioning. In this work, we propose a novel approach that leverages the expressiveness of latent diffusion to model in-support trajectory sequences as compressed latent skills. This facilitates learning a Q-function while avoiding extrapolation error via batch-constraining. The latent space is also expressive and copes gracefully with multi-modal data. We show that the learned temporally-abstract latent space encodes richer task-specific information for offline RL tasks than raw state-actions. This improves credit assignment and facilitates faster reward propagation during Q-learning. Our method demonstrates state-of-the-art performance on the D4RL benchmarks, particularly excelling in long-horizon, sparse-reward tasks.
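
    The batch-constraining step can be made concrete with a deliberately tiny sketch: tabular Q-learning where actions are replaced by latent skills and the bootstrap max ranges only over skills observed in the dataset. This is an assumption-laden toy, not the paper's method (which learns the latent space with a diffusion model and uses function approximation); the dataset, the "skill" codes, and all names are hypothetical:

```python
from collections import defaultdict

# Toy dataset of (state, skill, reward, next_state) tuples, where "skill"
# stands in for a latent code a trajectory encoder would assign (hypothetical).
dataset = [(0, 0, 0.0, 1), (1, 1, 0.0, 2), (1, 0, 0.0, 2), (2, 0, 1.0, 3)]

support = defaultdict(set)   # latent skills actually observed at each state
for s, z, _, _ in dataset:
    support[s].add(z)

Q = defaultdict(float)
gamma, alpha = 0.99, 0.5
for _ in range(200):
    for s, z, r, s2 in dataset:
        # Batch-constrained bootstrap: the max ranges only over in-support
        # skills at s2, so the target never queries out-of-support latents.
        target = r + gamma * max((Q[(s2, z2)] for z2 in support[s2]), default=0.0)
        Q[(s, z)] += alpha * (target - Q[(s, z)])

print({sz: round(v, 3) for sz, v in sorted(Q.items())})
```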